Grid Data Streaming
Authors
Abstract
Grid computing has flourished in recent years, enabling researchers to collaborate more efficiently by sharing computing, storage, and instrument resources. Data grids, which focus on large-scale data sharing and processing, are the most popular, and data management is one of the most essential functions of a grid software infrastructure. Applications such as astronomical observation, large-scale numerical simulation, and sensor networks generate ever more data, which poses great challenges to storage and processing capabilities. Most of these data-intensive applications can be regarded as data stream processing with fixed processing patterns and periodic looping. Grid data streaming management is gaining increasing attention in the grid community. This work provides a detailed survey of current grid data streaming research efforts and summarizes the features of the corresponding implementations. While traditional grid data management systems provide functions such as data transfer, placement, and location, data streaming in a grid environment requires additional support, e.g. data cleanup and transfer scheduling. For example, at storage-constrained grid nodes, data can be streamed, made available to the corresponding applications on demand, and finally cleaned up after processing is completed. Grid data streaming management is particularly essential for enabling grid applications on CPU-rich but storage-limited grid nodes. This work proposes a grid data streaming environment with detailed system analysis and design. Several additional modules, e.g. performance sensors, predictors, and schedulers, are implemented. Initial experimental results show that data streaming leads to better utilization of data storage and improves system performance significantly.

Key Words: Grid computing, data streams, data streaming applications.
This work is funded by the Ministry of Education of China under the quality engineering program for higher education and by the Ministry of Science and Technology of China under the national 863 high-tech R&D program (grant No. 2006AA10Z237).
Wen Zhang, Junwei Cao, Lianchen Liu, and Cheng Wu
Similar resources
Grid Resource Management and Scheduling for Data Streaming Applications
Data streaming applications bring new challenges to resource management and scheduling for grid computing. Since real-time data streaming is required as data processing is going on, integrated grid resource management becomes essential among processing, storage and networking resources. Traditional scheduling approaches may not be sufficient for such applications, since usually only one aspect ...
Streaming data between Web Services. Comparison of streaming protocols over a stream-enabled Web Service
The ability to stream data between web-services is vital for the implementation of complex workflows on the Grid. In this work we investigate various protocols as to their suitability for streaming, taking into consideration issues such as security, reliability and speed over different Grid configurations. To perform these comparisons we have implemented a Server/Client web service that provide...
Grid Resource Management and Scheduling for Data Streaming Applications
Data streaming applications bring new challenges to resource management and scheduling for grid computing. Since real-time data streaming is required as data processing is going on, integrated grid resource management becomes essential among processing, storage and networking resources. Traditional scheduling approaches may not be sufficient for such applications, since usually only one aspect ...
Linked Data and Complex Event Processing for the Smart Energy Grid
The Smart Grid aims at making the current energy grid more efficient and eco-friendly. The Smart Grid features an IT-layer, which allows communication between a multitude of stakeholders and will have to be integrated with other “smart” systems (e.g., smart factories or smart cities) to operate effectively. Thus, many participants will be involved and will exchange large volumes of data, leadin...
Fuzzy Data Envelopment Analysis for Classification of Streaming Data
The classification of fuzzy uncertain data is considered one of the most challenging issues in data analysis. In spite of the significance of fuzzy data in mathematical programming, the development of the analytical methods of fuzzy data is slow. Therefore, the current study proposes a new fuzzy data classification method based on fuzzy data envelopment analysis (DEA) which can handle strea...